Abstract. We extend a technique for lower-bounding the mixing time of card-shuffling
Markov chains, and use it to bound the mixing time of the Rudvalis Markov chain, as
well as two variants considered by Diaconis and Saloff-Coste. We show that in each case
$\Theta(n^3 \log n)$ shuffles are required for the permutation to randomize, which matches (up to
constants) previously known upper bounds. In contrast, for the two variants, the mixing
time of an individual card is only $\Theta(n^2)$ shuffles.



1. Introduction 



In earlier work (Wilson, 2001) we derived upper and lower bounds on the mixing time of
a variety of Markov chains, including Markov chains on lozenge tilings, card shuffling, and
exclusion processes. The mixing time of a Markov chain is the time it takes to approach its
stationary distribution, which is often measured in total variation distance (defined below).
In this article we focus on the method for lower bounding the mixing time, and extend its
applicability to the Rudvalis card shuffling Markov chain (defined below) and related shuffles.

Let $P_x^t$ denote the distribution of the Markov chain started in state $x$ after it is run for $t$
steps, and let $\mu$ denote the stationary distribution of the Markov chain. The total variation
distance between distributions $P_x^t$ and $\mu$ is defined by

$$\|P_x^t - \mu\|_{TV} = \max_A |P_x^t(A) - \mu(A)| = \frac12 \sum_y |P_x^t(y) - \mu(y)| = \frac12 \|P_x^t - \mu\|_1,$$

and the mixing time is the time it takes for $\max_x \|P_x^t - \mu\|_{TV}$ to become small, say smaller
than $\varepsilon$. See (Aldous and Fill, 2005; Diaconis, 1988, 1996) for further background.
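
The two expressions for the total variation distance (the maximum over sets $A$ and half the $L^1$ norm) can be checked against each other on a toy example. The sketch below is only an illustration: the distributions `point_mass` and `uniform` are hypothetical and are not taken from the shuffles studied here.

```python
def total_variation(p, q):
    """Half the L1 distance between two distributions given as dicts {state: probability}."""
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in states)


def total_variation_by_sets(p, q):
    """The same distance via the maximum over sets A; a maximizing A is {y : p(y) > q(y)}."""
    states = set(p) | set(q)
    return sum(p.get(s, 0.0) - q.get(s, 0.0)
               for s in states if p.get(s, 0.0) > q.get(s, 0.0))


# Hypothetical example: a point mass versus the uniform distribution on four states.
point_mass = {"a": 1.0}
uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
print(total_variation(point_mass, uniform), total_variation_by_sets(point_mass, uniform))
```

Both functions return 0.75 on this example, attained by the set $A = \{a\}$.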



Arunas Rudvalis proposed the following shuffle: with probability 1/2 move the top card
to the bottom of the deck, and with probability 1/2 move it to the second position from
the bottom. Hildebrand (1990) showed that the Rudvalis shuffle mixes in $O(n^3 \log n)$ time.
Diaconis and Saloff-Coste (1995) studied a variation, the shift-or-swap shuffle, which at
each step either moves the top card to the bottom of the deck or exchanges the top two
cards, each move with probability 1/2. Diaconis and Saloff-Coste (1993) also studied a
symmetrized version of the Rudvalis shuffle, which at each step does one of four moves each
with probability 1/4: move top card to bottom, move bottom card to top, exchange top two
cards, or do nothing. In each case an $O(n^3 \log n)$ upper bound on the mixing time is known,
but order $n^3 \log n$ lower bounds were not known.
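
The three shuffles described above can be written down directly. The following is a minimal sketch, assuming the deck is a Python list with the top card at index 0; the function names and the `rng` parameter are illustrative choices, not notation from the paper.

```python
import random


def rudvalis_step(deck, rng):
    """Top card goes to the bottom or to second from the bottom, each with probability 1/2."""
    top = deck.pop(0)
    if rng.random() < 0.5:
        deck.append(top)                  # bottom of the deck
    else:
        deck.insert(len(deck) - 1, top)   # second position from the bottom


def shift_or_swap_step(deck, rng):
    """Move the top card to the bottom, or exchange the top two cards, each with probability 1/2."""
    if rng.random() < 0.5:
        deck.append(deck.pop(0))
    else:
        deck[0], deck[1] = deck[1], deck[0]


def symmetrized_rudvalis_step(deck, rng):
    """One of four moves, each with probability 1/4."""
    move = rng.randrange(4)
    if move == 0:
        deck.append(deck.pop(0))              # move top card to bottom
    elif move == 1:
        deck.insert(0, deck.pop())            # move bottom card to top
    elif move == 2:
        deck[0], deck[1] = deck[1], deck[0]   # exchange top two cards
    # move == 3: do nothing


rng = random.Random(0)   # arbitrary seed, used only for reproducibility
deck = list(range(1, 9))
for _ in range(100):
    rudvalis_step(deck, rng)
print(deck)
```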

To lower bound the mixing time, one finds a set $A$ of states such that $P_x^t(A)$ is close to
1 and $\mu(A)$ is close to 0. The approach taken in (Wilson, 2001) uses an eigenvector $\Phi$ of
the Markov chain. If $X_t$ denotes the state of the Markov chain at time $t$, then
$E[\Phi(X_{t+1}) \mid X_t] = \lambda \Phi(X_t)$. To obtain a good lower bound, we need $\lambda < 1$ but $\lambda \approx 1$. Since $\lambda < 1$,
in stationarity $E[\Phi(X)] = 0$, but since $\lambda \approx 1$, it takes a long time before $E[\Phi(X_t)] \approx 0$.
If furthermore the eigenvector is "smooth" in the sense that $E[|\Phi(X_{t+1}) - \Phi(X_t)|^2 \mid X_t]$ is
never large, then we can bound the variance of $\Phi(X_t)$, showing that it is with high probability
confined to a small interval about its expected value. Then provided that $E[\Phi(X_t)]$ is large
enough, we can reliably distinguish $\Phi(X_t)$ from $\Phi(X)$ in stationarity, which implies that the
Markov chain has not yet mixed by time $t$. Saloff-Coste (2002) gives an exposition of this
and related ideas.

For the Rudvalis card shuffling Markov chain and its variants, there are a few difficulties
when directly applying this approach to lower bound the mixing time. The eigenvectors
that one wants to use are complex-valued rather than real-valued, and $\Phi(X_t)$ is no longer
confined to a small interval around $E[\Phi(X_t)]$. Instead what happens is that $\Phi(X_t)$ is with
high probability confined to a narrow annulus centered at 0. $E[\Phi(X_t)]$ becomes too small
too quickly, and $\mathrm{Var}[\Phi(X_t)]$ remains too large to be useful.

To lower bound the mixing time we want to in effect work with $|\Phi(X_t)|$ and forget about
$\arg \Phi(X_t)$. To do this we start by lifting the Markov chain to a larger state space, and let
us denote the state at time $t$ of the lifted chain by $(X_t, Y_t)$. (The mixing time of the lifted
Markov chain will upper bound the mixing time of the original chain, so at the outset it is
not clear that we can lower bound the mixing time of the original chain by considering its
lifted version.) We find an eigenvector $\Psi$ on the lifted chain such that for all $x$, $y_1$, and $y_2$,
$|\Psi(x, y_1)| = |\Psi(x, y_2)|$, so that $|\Psi(X_t)|$ is well-defined. If we show that $\Psi(X_t, Y_t)$ is with high
probability close to $E[\Psi(X_t, Y_t)]$, which in turn is far from 0, it will follow that $|\Psi(X_t)|$ is with
high probability confined to a small interval far from 0, making it statistically distinguishable
from $|\Psi(X)|$ in stationarity, implying that the Markov chain has not mixed by time $t$.

In the following sections we carry out these ideas to obtain lower bounds on the mixing 
time that match (to within constants) the previously obtained upper bounds. Specifically, 
we show 

Theorem 1. For any fixed $\varepsilon > 0$, after $\frac{1-\varepsilon}{8\pi^2} n^3 \log n$ shuffles of the Rudvalis shuffle,
$\frac{1-\varepsilon}{2\pi^2} n^3 \log n$ shuffles of the shift-or-swap shuffle, or $\frac{1-\varepsilon}{\pi^2} n^3 \log n$ shuffles for the
symmetrized Rudvalis shuffle, the distribution of the state of the deck has variation distance
$\ge 1 - \varepsilon$ from uniformity.

2. Lifting the Shuffles 

When the top card is placed at the bottom of the deck, the position of any given card is
cyclically shifted left, so we will call this move "shift-left", and similarly "shift-right" is the
move which places the bottom card on top of the deck. The move which exchanges the top
and bottom cards will be called "swap", and the move which does nothing will be called
"hold". Thus the moves of the Rudvalis Markov chain are "shift-left" and "swap & shift-left",
while the moves of the variation considered by Diaconis and Saloff-Coste are "shift-left" and
"swap", and the moves of the symmetrized version are "shift-left", "shift-right", "swap",
and "hold".

The state $X_t$ of the Markov chain is the permutation giving the order of the cards at
time $t$. Let $\diamondsuit$ (the diamond-suit symbol) denote a particular card of interest, and let $X_t(\diamondsuit)$
denote the location of card $\diamondsuit$ within the deck at time $t$, where the positions are numbered
from 1 to $n$ starting from the top of the deck. When the Markov chain does a shift-left
or shift-right, the position of card $\diamondsuit$ changes, but all the cards get moved together,
so the move does not have such a large randomizing effect on the permutation. We will track the
position of card $\diamondsuit$, but we should also track the amount of shifting. So when we lift the
Markov chain to $(X_t, Y_t)$, the lifted Markov chain will also keep track of

$$Y_t = \#\,\text{shift-left's} - \#\,\text{shift-right's} \bmod n.$$

For the Rudvalis shuffle $Y_t = t \bmod n$ deterministically, whereas for the other two variations,
$Y_t$ will be a random number between 1 and $n$ which approaches uniformity in $O(n^2)$ time.

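Recall that we need an eigenvector $\Psi$ of the lifted chain $(X_t, Y_t)$ such that $|\Psi(X_t, Y_t)|$ is a
function of $X_t$ alone. For a given card $\diamondsuit$, let

$$\Psi_\diamondsuit(X_t, Y_t) = v(X_t(\diamondsuit)) \exp(Z_t(\diamondsuit)\, 2\pi i/n),$$

where

$$Z_t(\diamondsuit) = X_t(\diamondsuit) - X_0(\diamondsuit) + Y_t \bmod n,$$

and $v()$ is a function, to be determined later, which makes $\Psi_\diamondsuit$ an eigenvector. Initially
$Z_0(\diamondsuit) = 0$, and the only time that $Z_t(\diamondsuit)$ changes is when the card $\diamondsuit$ gets transposed.
The dynamics of $(X_t(\diamondsuit), Z_t(\diamondsuit))$ (mod $n$) are summarized by

$$(X_{t+1}(\diamondsuit), Z_{t+1}(\diamondsuit)) = \begin{cases}
(X_t(\diamondsuit), Z_t(\diamondsuit)) & \text{if move was "hold"} \\
(X_t(\diamondsuit) - 1, Z_t(\diamondsuit)) & \text{if move was "shift-left"} \\
(X_t(\diamondsuit) + 1, Z_t(\diamondsuit)) & \text{if move was "shift-right"} \\
(X_t(\diamondsuit) - 1, Z_t(\diamondsuit) - 1) & \text{if move was "swap" and } X_t(\diamondsuit) = 1 \\
(X_t(\diamondsuit) + 1, Z_t(\diamondsuit) + 1) & \text{if move was "swap" and } X_t(\diamondsuit) = n \\
(X_t(\diamondsuit), Z_t(\diamondsuit)) & \text{if move was "swap" and } \diamondsuit \text{ elsewhere}
\end{cases}$$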



We define

$$\Psi(X_t, Y_t) = \sum_{\diamondsuit=1}^{n} \Psi_\diamondsuit(X_t, Y_t).$$

If we increment $y$ while holding $x$ fixed, then $\Psi(x, y)$ gets multiplied by the phase factor
$\exp(2\pi i/n)$, so we have an eigenvector satisfying our requirement that $|\Psi(X_t, Y_t)|$ be a
function of $X_t$ alone.
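
The bookkeeping of the lifted chain can be sanity-checked by simulation. The sketch below, with illustrative names `lifted_step` and `z_changes_only_on_transposition`, applies random moves of the symmetrized chain in its lifted form and confirms that $Z_t(\diamondsuit) = X_t(\diamondsuit) - X_0(\diamondsuit) + Y_t \bmod n$ changes only for a card that is transposed: by $-1$ when it leaves position 1 and by $+1$ when it leaves position $n$. The deck size, step count, and seed are arbitrary.

```python
import random

MOVES = ("hold", "shift-left", "shift-right", "swap")


def lifted_step(pos, y, move, n):
    """Apply one move to the positions pos[card] (1..n) and the shift counter y (mod n)."""
    if move == "shift-left":
        pos = {c: (x - 2) % n + 1 for c, x in pos.items()}   # x -> x - 1, and 1 -> n
        y = (y + 1) % n
    elif move == "shift-right":
        pos = {c: x % n + 1 for c, x in pos.items()}         # x -> x + 1, and n -> 1
        y = (y - 1) % n
    elif move == "swap":                                      # exchange top and bottom cards
        pos = {c: (n if x == 1 else 1 if x == n else x) for c, x in pos.items()}
    return pos, y                                             # "hold" changes nothing


def z_changes_only_on_transposition(n, steps, seed):
    rng = random.Random(seed)
    start = {c: c for c in range(1, n + 1)}                   # card c starts in position c
    pos, y = dict(start), 0
    for _ in range(steps):
        move = rng.choice(MOVES)
        z_old = {c: (pos[c] - start[c] + y) % n for c in pos}
        new_pos, new_y = lifted_step(pos, y, move, n)
        for c in pos:
            z_new = (new_pos[c] - start[c] + new_y) % n
            if move == "swap" and pos[c] == 1:
                expected = (z_old[c] - 1) % n
            elif move == "swap" and pos[c] == n:
                expected = (z_old[c] + 1) % n
            else:
                expected = z_old[c]
            if z_new != expected:
                return False
        pos, y = new_pos, new_y
    return True


print(z_changes_only_on_transposition(n=7, steps=2000, seed=1))
```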



3. The Lower Bound Lemma 



The lower bounding lemma that we shall use is similar to Lemma 4 of (Wilson, 2001),
but with the modifications described in the introduction. Saloff-Coste (2002) also gives a
generalization of Lemma 4 from (Wilson, 2001) that may be used when the eigenvalues are
complex, but the extension below seems to be better suited for the shuffles considered here.

Lemma 2. Suppose that a Markov chain $X_t$ has a lifting $(X_t, Y_t)$, and that $\Psi$ is an
eigenfunction of the lifted Markov chain: $E[\Psi(X_{t+1}, Y_{t+1}) \mid (X_t, Y_t)] = \lambda \Psi(X_t, Y_t)$. Suppose that
$|\Psi(x, y)|$ is a function of $x$ alone, $|\lambda| < 1$, $\Re(\lambda) > 1/2$, and that we have an upper bound $R$
on $E[|\Psi(X_{t+1}, Y_{t+1}) - \Psi(X_t, Y_t)|^2 \mid (X_t, Y_t)]$. Let $\gamma = 1 - \Re(\lambda)$. Then when the number of
Markov chain steps $t$ is bounded by

$$t \le \frac{\log \Psi_{\max} + \frac12 \log \frac{\gamma\varepsilon}{4R}}{-\log(1-\gamma)},$$

the variation distance of $X_t$ (the state of the original Markov chain) from stationarity is at
least $1 - \varepsilon$.
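
To get a feel for the size of the bound in Lemma 2, one can evaluate its right-hand side numerically. The sketch below is a hedged illustration: the helper name `lemma2_time_bound` is made up, the values $\gamma \approx 4\pi^2/n^3$ and $\Psi_{\max} \approx n$ are the ones the paper derives for the Rudvalis shuffle with $p = 1/2$, and $R = 1/n^2$ is a placeholder, since the paper only establishes $R = O(1/n^2)$ without an explicit constant.

```python
import math


def lemma2_time_bound(psi_max, gamma, R, eps):
    """Right-hand side of the bound on t in Lemma 2."""
    numerator = math.log(psi_max) + 0.5 * math.log(gamma * eps / (4.0 * R))
    return numerator / (-math.log1p(-gamma))   # log1p(-gamma) = log(1 - gamma), computed accurately


# Assumed inputs for the Rudvalis shuffle with p = 1/2 (R uses a placeholder constant of 1).
n, eps = 10**4, 0.01
bound = lemma2_time_bound(psi_max=n, gamma=4 * math.pi**2 / n**3, R=1.0 / n**2, eps=eps)
leading_term = n**3 * math.log(n) / (8 * math.pi**2)
print(bound / leading_term)
```

The ratio to the leading term $n^3 \log n / (8\pi^2)$ approaches 1 only slowly, because the correction $\frac12 \log(\pi^2 \varepsilon)$ is of constant order while the main term grows like $\frac12 \log n$.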



The proof of this modified lemma is similar to the proof of Lemma 4 in (Wilson, 2001),
but for the reader's convenience we give the modified proof. In the following sections we
determine the functions $v()$ for the Markov chains which give the requisite eigenfunction
and then use Lemma 2 to obtain the mixing time bounds stated in Theorem 1.
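Proof of Lemma 2. Let $\Psi_t = \Psi(X_t, Y_t)$, and $\Delta\Psi = \Psi_{t+1} - \Psi_t$. By induction

$$E[\Psi_t \mid (X_0, Y_0)] = \Psi_0 \lambda^t.$$

By our assumptions on $\lambda$, in equilibrium $E[\Psi] = 0$.
We have $E[\Delta\Psi \mid (X_t, Y_t)] = (\lambda - 1)\Psi_t$ and

$$\Psi_{t+1}\Psi_{t+1}^* = \Psi_t\Psi_t^* + \Psi_t \Delta\Psi^* + \Psi_t^* \Delta\Psi + |\Delta\Psi|^2$$

$$E[\Psi_{t+1}\Psi_{t+1}^* \mid (X_t, Y_t)] = \Psi_t\Psi_t^*[1 + (\lambda - 1)^* + (\lambda - 1)] + E[|\Delta\Psi|^2 \mid (X_t, Y_t)] \le \Psi_t\Psi_t^*[2\Re(\lambda) - 1] + R$$

and so by induction,

$$E[\Psi_t\Psi_t^*] \le \Psi_0\Psi_0^*[2\Re(\lambda) - 1]^t + \frac{R}{2 - 2\Re(\lambda)},$$

then subtracting $E[\Psi_t]E[\Psi_t]^*$,

$$\mathrm{Var}[\Psi_t] \le \Psi_0\Psi_0^*\left[[2\Re(\lambda) - 1]^t - (\lambda\lambda^*)^t\right] + \frac{R}{2 - 2\Re(\lambda)}.$$

Since $(1 - \lambda)(1 - \lambda^*) > 0$, we have $\lambda\lambda^* > 2\Re(\lambda) - 1$, and by assumption $2\Re(\lambda) - 1 > 0$.
Hence $(\lambda\lambda^*)^t \ge [2\Re(\lambda) - 1]^t$, and we have for each $t$

$$\mathrm{Var}[\Psi_t] \le \frac{R}{2 - 2\Re(\lambda)} = \frac{R}{2\gamma}.$$

From Chebychev's inequality,

$$\Pr\left[|\Psi_t - E[\Psi_t]| \ge \sqrt{R/(\gamma\varepsilon)}\right] \le \varepsilon/2.$$

As $E[\Psi_\infty] = 0$, if $|E[\Psi_t]| \ge \sqrt{4R/(\gamma\varepsilon)}$, then the probability that $|\Psi_t|$ deviates below $\sqrt{R/(\gamma\varepsilon)}$
is at most $\varepsilon/2$, and the probability that $|\Psi|$ in stationarity deviates above this threshold is
at most $\varepsilon/2$, so the variation distance between the distribution at time $t$ and stationarity
must be at least $1 - \varepsilon$. If we take the initial state to be the one maximizing $|\Psi_0|$, then

$$|E[\Psi_t]| = |\Psi_{\max}||\lambda|^t \ge |\Psi_{\max}|(\Re(\lambda))^t = |\Psi_{\max}|(1 - \gamma)^t \ge \sqrt{\frac{4R}{\gamma\varepsilon}}$$

when

$$t \le \frac{\log \Psi_{\max} + \frac12 \log \frac{\gamma\varepsilon}{4R}}{-\log(1-\gamma)}. \qquad \Box$$

4. The Rudvalis Shuffle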



The first shuffle we consider is the original Rudvalis Markov chain. It will be instructive to
consider a slight generalization, where the swap & shift-left move takes place with probability
$p$, and the shift-left move takes place with probability $1 - p$. We shall assume that $0 < p < 1$
and that $p$ is independent of $n$. The particular values of $p$ that we are interested in are
$p = 1/2$ (for the original Rudvalis chain) and $p = 1/3$.

We need to find an eigenvector for the random walk that a single card takes under this
shuffle. We remark that this random walk is similar in nature to (but distinct from) a class
of random walks, known as daisy chains, for which Wilmer (1999) obtained eigenvalues and
eigenvectors. From other work of Wilmer (2002), it readily follows that the position of a
single card takes order $n^3$ steps to randomize, and that the precise asymptotic distance
from stationarity of the card's position after $cn^3$ shuffles is given by an explicit expression
involving theta functions.

Lemma 3. The random walk followed by a card $\diamondsuit$ under the lifted Rudvalis shuffle has an
eigenvector of the form

$$\Psi_\diamondsuit(x, z) = v(x) e^{2\pi i z/n}$$

where $v(x)$ is the $x$th number in the list

$$\lambda^{n-2}, \ldots, \lambda^2, \lambda, 1, \chi,$$

the eigenvalue is

$$\lambda = 1 - \frac{p}{1-p}\frac{4\pi^2}{n^3} + O(1/n^4),$$

and

$$\chi = 1 + \frac{p}{1-p}\frac{2\pi i}{n} + O(1/n^2).$$

Proof. Let $w = \exp(2\pi i/n)$. If at time $t$ card $\diamondsuit$ is in any location between 2 and $n-1$, then

$$\Psi_\diamondsuit(X_{t+1}, Y_{t+1}) = \lambda \Psi_\diamondsuit(X_t, Y_t)$$

deterministically. To ensure that

$$E[\Psi_\diamondsuit(X_{t+1}, Y_{t+1}) \mid (X_t, Y_t)] = \lambda \Psi_\diamondsuit(X_t, Y_t)$$

when $X_t(\diamondsuit) = 1$, we require

$$p w^{-1} + (1-p)\chi = \lambda^{n-1}$$

$$\chi = \frac{\lambda^{n-1} - p w^{-1}}{1-p},$$

and for when $X_t(\diamondsuit) = n$ we need

$$p w \chi + (1-p) = \lambda\chi$$

$$\chi = \frac{1-p}{\lambda - pw}.$$

Given these two equations, $\Psi_\diamondsuit$ will be an eigenvector with eigenvalue $\lambda$. Thus,

$$f(\lambda) = \lambda^n - pw\lambda^{n-1} - pw^{-1}\lambda - 1 + 2p = 0.$$

To identify a root of this polynomial, we use Newton's method: $z_{k+1} = z_k - f(z_k)/f'(z_k)$,
starting with $z_0 = 1$. By Taylor's theorem,

$$|f(z_{k+1})| \le \frac12 \max_{0 \le u \le 1} |f''(u z_k + (1-u) z_{k+1})| \times \left|\frac{f(z_k)}{f'(z_k)}\right|^2.$$

If $|z - 1| \le 1/n^2$, then $f'(z) = (1-p)n + O(1)$ and $f''(z) = (1-p)n^2 + O(n)$. Consequently,
if $|z_k - 1| \le 1/n^2$ and $|z_{k+1} - 1| \le 1/n^2$, then

$$|f(z_{k+1})| \le \frac{1 + O(1/n)}{2(1-p)} |f(z_k)|^2.$$

Since $f(z_0) = p(2 - w - w^{-1}) = p\,4\pi^2/n^2 + O(1/n^4)$, for large enough $n$ we have by induction
that $|f(z_k)| \le (1-p)\left(\frac{p}{1-p}\frac{4\pi^2}{n^2}\right)^{2^k}$, $|z_{k+1} - z_k| \le \left(\frac{p}{1-p}\frac{4\pi^2}{n^2}\right)^{2^k}/(n + O(1))$, and
$|z_{k+1} - z_0| \le O(1/n^3)$. Thus, for large enough $n$, the sequence $z_0, z_1, z_2, \ldots$ converges to a
point $\lambda$, which by continuity must be a zero of $f$. Since $z_1 = 1 - \frac{p}{1-p}\frac{4\pi^2}{n^3} + O(1/n^4)$
and $|\lambda - z_1| = O(1/n^5)$, we conclude that the polynomial $f$ has a root at

$$\lambda = 1 - \frac{p}{1-p}\frac{4\pi^2}{n^3} + O(1/n^4). \qquad \Box$$
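It is easy to check that $\Psi_{\max} = n + O(1/n)$. Next we evaluate $R$ for this eigenvector.

$$\Psi_\diamondsuit(X_{t+1}, Y_{t+1}) - \Psi_\diamondsuit(X_t, Y_t) = w^{Z_t(\diamondsuit)} \times \begin{cases}
(\lambda - 1)\lambda^{n-1-X_t(\diamondsuit)} = O(1/n^3) & \text{if } 2 \le X_t(\diamondsuit) \le n-1 \\
\chi - \lambda^{n-2} = O(1/n) & \text{if } X_t(\diamondsuit) = 1, \text{ shift-left} \\
w^{-1} - \lambda^{n-2} = O(1/n) & \text{if } X_t(\diamondsuit) = 1, \text{ swap \& shift-left} \\
1 - \chi = O(1/n) & \text{if } X_t(\diamondsuit) = n, \text{ shift-left} \\
w\chi - \chi = O(1/n) & \text{if } X_t(\diamondsuit) = n, \text{ swap \& shift-left}
\end{cases}$$

Adding up these contributions over the various cards $\diamondsuit$, we find

$$|\Psi(X_{t+1}, Y_{t+1}) - \Psi(X_t, Y_t)| \le O(1/n)$$

$$R = E[|\Psi(X_{t+1}, Y_{t+1}) - \Psi(X_t, Y_t)|^2] \le O(1/n^2).$$

Plugging $\lambda$, $\Psi_{\max}$, and $R$ into Lemma 2 gives, for fixed values of $\varepsilon$, a mixing time lower
bound of

$$(1 - o(1)) \frac{1-p}{p} \frac{1}{8\pi^2} n^3 \log n.$$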

5. The Shift-or-Swap Shuffle 

At this point there are two ways we can approach the shift-or-swap shuffle. We can
either take a direct approach in the same manner as in the previous section, or we can do a
comparison with the Rudvalis shuffle with $p = 1/3$.

If we take the direct approach, then we let $v(x)$ denote the $x$th element of the list

$$(2\lambda-1)^{n-2}, \ldots, (2\lambda-1)^2, 2\lambda-1, 1, \chi.$$

The constraints on $\chi$ are

$$\chi = \frac{2\lambda(2\lambda-1)^{n-2}}{1 + w^{-1}}$$

and

$$\chi = \frac{1 + w(2\lambda-1)^{n-2}}{2\lambda}.$$

As in section 4, we solve for $\lambda$ and find that $\lambda = 1 - (1 + o(1))\pi^2/n^3$, compute $\Psi_{\max} = \Theta(n)$
and $R = O(1/n^2)$, and obtain the mixing time lower bound of $\frac{1-o(1)}{2\pi^2} n^3 \log n$ shuffles.

Alternatively, we can couple the shift-or-swap shuffle with the Rudvalis shuffle. Whenever
the shift-or-swap shuffle makes a shift, the number of swap's since the previous shift will be
odd with probability 1/3. If it is odd, then this is equivalent to a swap-&-shift-left move,
and if it is even, then it is equivalent to a shift-left move. This explains why we were
interested in the case $p = 1/3$ in the previous section. After $t$ steps, with high probability
$(1 + o(1))t/2$ shift moves occurred, which means that the state of the deck is what it would be
after $(1 + o(1))t/2$ Rudvalis shuffles (with $p = 1/3$), possibly with an extra swap move. The
lower bound for the shift-or-swap shuffle does not follow from the lower bound itself for the
Rudvalis shuffle, but it does follow from what we showed about $|\Psi|$ for the Rudvalis shuffle.
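
The probability 1/3 used in the coupling comes from a geometric series: between two consecutive shifts there are exactly $k$ swaps with probability $(1/2)^{k+1}$, and summing over odd $k$ gives $\sum_{j \ge 1} 4^{-j} = 1/3$. The sketch below checks the partial sums in exact arithmetic; the function name and the truncation point are arbitrary choices.

```python
from fractions import Fraction


def prob_odd_swaps(max_swaps):
    """Partial sum of P(exactly k swaps before the next shift) over odd k <= max_swaps."""
    half = Fraction(1, 2)
    return sum(half ** (k + 1) for k in range(1, max_swaps + 1, 2))


partial = prob_odd_swaps(61)
print(float(partial), Fraction(1, 3) - partial)
```

The gap to 1/3 after truncating at 61 swaps is exactly $\frac13 4^{-31}$, so the partial sums converge geometrically.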
6. Symmetrized Version of the Rudvalis Shuffle

When analyzing the symmetrized version of the Rudvalis shuffle, it will be convenient to
have symmetric coordinates, so we re-index the card locations to run from $-(n-1)/2$ up to
$(n-1)/2$, and the swaps occur at locations $-(n-1)/2$ and $(n-1)/2$.

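Lemma 4. The random walk followed by a card $\diamondsuit$ under the lifted symmetrized Rudvalis
shuffle has an eigenvector of the form

$$\Psi_\diamondsuit(x, z) = v(x) e^{2\pi i z/n}$$

where

$$v(x) = \frac{1+\delta}{2} e^{i\theta x} + \frac{1-\delta}{2} e^{-i\theta x} = \cos(\theta x) + i\delta\sin(\theta x),$$

$$\theta = (1 + o(1)) \sqrt{2}\,\pi\, n^{-3/2}, \qquad \delta = (1 + o(1)) \frac{1}{\sqrt{2}\, n^{1/2}},$$

and both $\delta$ and $\theta$ are real. The eigenvalue is

$$\lambda = \frac{1 + \cos\theta}{2} = 1 - \frac{\pi^2 + o(1)}{2} n^{-3}.$$

Proof. When $x \ne \pm(n-1)/2$, we can readily compute the eigenvalue $\lambda$ to be

$$\lambda = \frac{\frac14 v(x+1) + \frac12 v(x) + \frac14 v(x-1)}{v(x)}
= \frac12 + \frac12 \cdot \frac{\cos(\theta x)\cos\theta + i\delta\sin(\theta x)\cos\theta}{\cos(\theta x) + i\delta\sin(\theta x)}
= \frac{1 + \cos\theta}{2}.$$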

In order for our guessed eigenvector to be correct, there is also a constraint at $x = (n-1)/2$:

$$\lambda\, v\!\left(\tfrac{n-1}{2}\right) = \frac{v\!\left(\frac{n-1}{2}\right)}{4} + \frac{v\!\left(\frac{n-3}{2}\right)}{4} + (1 + w)\frac{v\!\left(-\frac{n-1}{2}\right)}{4}$$

$$(1 + w)\, v(-(n-1)/2) = v((n-1)/2) + v((n+1)/2)$$

$$(1 + w)(1 + \delta)e^{-i\theta(n-1)/2} + (1 + w)(1 - \delta)e^{i\theta(n-1)/2} = (1 + \delta)e^{i\theta(n-1)/2} + (1 - \delta)e^{-i\theta(n-1)/2} + (1 + \delta)e^{i\theta(n+1)/2} + (1 - \delta)e^{-i\theta(n+1)/2}$$

$$(w + 2\delta + w\delta)e^{-i\theta(n-1)/2} + (w - 2\delta - w\delta)e^{i\theta(n-1)/2} = (1 + \delta)e^{i\theta(n+1)/2} + (1 - \delta)e^{-i\theta(n+1)/2}$$

$$w\cos\frac{\theta(n-1)}{2} - \cos\frac{\theta(n+1)}{2} = \delta\left[(2 + w)\, i\sin\frac{\theta(n-1)}{2} + i\sin\frac{\theta(n+1)}{2}\right].$$

The corresponding constraint at $x = -(n-1)/2$ is obtained by replacing $w$ with $1/w$ and
replacing $\theta$ with $-\theta$. Since these substitutions give the complex-conjugate of the above
equation, the constraints at $x = \pm(n-1)/2$ are equivalent.
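Equating the real parts of this equation gives

$$\delta = \frac{\cos\frac{2\pi}{n}\cos\frac{\theta(n-1)}{2} - \cos\frac{\theta(n+1)}{2}}{-\sin\frac{2\pi}{n}\sin\frac{\theta(n-1)}{2}},$$

and equating the imaginary parts gives

$$\delta = \frac{\sin\frac{2\pi}{n}\cos\frac{\theta(n-1)}{2}}{2\sin\frac{\theta(n-1)}{2} + \cos\frac{2\pi}{n}\sin\frac{\theta(n-1)}{2} + \sin\frac{\theta(n+1)}{2}}.$$

Cross-multiplying and performing trigonometric simplifications gives

$$-\sin^2\!\left(\tfrac{2\pi}{n}\right)\frac{\sin(\theta(n-1))}{2} = \cos^2\!\left(\tfrac{2\pi}{n}\right)\frac{\sin(\theta(n-1))}{2} - \frac{\sin(\theta(n+1))}{2} + \cos\!\left(\tfrac{2\pi}{n}\right)\sin(\theta(n-1)) - 2\sin\frac{\theta(n-1)}{2}\cos\frac{\theta(n+1)}{2} + \cos\!\left(\tfrac{2\pi}{n}\right)\sin\theta$$

so that

$$\left(\frac12 + \cos\frac{2\pi}{n}\right)\sin(\theta(n-1)) - \frac12\sin(\theta(n+1)) - \sin(\theta n) + \left(1 + \cos\frac{2\pi}{n}\right)\sin(\theta) = 0. \qquad (1)$$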



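Equation (1) is exact, but to estimate a solution, we perform a series expansion in $\theta$:

$$0 = \left[\left(\frac12 + \cos\frac{2\pi}{n}\right)(n-1) - (n+1)/2 - n + 1 + \cos\frac{2\pi}{n}\right]\theta
- \left[\left(\frac12 + \cos\frac{2\pi}{n}\right)(n-1)^3 - (n+1)^3/2 - n^3 + 1 + \cos\frac{2\pi}{n}\right]\frac{\theta^3}{6} + O(n^4\theta^5)$$

$$= -\left[\frac{2\pi^2}{n} + O(1/n^2)\right]\theta + \left[6n^2 + O(n)\right]\frac{\theta^3}{6} + O(n^4\theta^5).$$

While $\theta = 0$ is a solution, our expression for $\delta$ has a singularity at $\theta = 0$, so we seek a
different solution. Ignoring the error terms suggests $\theta \approx \sqrt{2}\,\pi\, n^{-3/2}$. Since the function in
(1) is real-valued, we can appeal to the intermediate value theorem to show that there is in
fact a root at

$$\theta = (1 + o(1))\sqrt{2}\,\pi\, n^{-3/2}.$$

For this value of $\theta$ we have

$$\lambda = \frac{1 + \cos\theta}{2} = 1 - \frac{\pi^2 + o(1)}{2} n^{-3}$$

and (using the second equation for $\delta$)

$$\delta = (1 + o(1)) \frac{2\pi/n}{2n\theta/2 + n\theta/2 + n\theta/2} = (1 + o(1)) \frac{1}{\sqrt{2}\, n^{1/2}}. \qquad \Box$$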
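
The asymptotics in Lemma 4 can be probed numerically. The sketch below assumes that equation (1) reads $(\frac12 + \cos\frac{2\pi}{n})\sin(\theta(n-1)) - \frac12\sin(\theta(n+1)) - \sin(\theta n) + (1 + \cos\frac{2\pi}{n})\sin\theta = 0$ and that $\delta$ is given by the real-part and imaginary-part formulas derived from the boundary constraint. It locates the nonzero root by bisection and compares it with $\sqrt{2}\,\pi\, n^{-3/2}$. The bracketing interval and $n = 100$ are illustrative assumptions, not choices made in the paper.

```python
import math


def g(theta, n):
    """Left-hand side of equation (1), in the form assumed above."""
    c = math.cos(2 * math.pi / n)
    return ((0.5 + c) * math.sin(theta * (n - 1)) - 0.5 * math.sin(theta * (n + 1))
            - math.sin(theta * n) + (1 + c) * math.sin(theta))


def solve_theta(n, iterations=200):
    """Bisection on [theta0 / 2, 2 * theta0], where theta0 = sqrt(2) * pi * n**-1.5."""
    theta0 = math.sqrt(2) * math.pi * n ** -1.5
    lo, hi = theta0 / 2, 2 * theta0      # g(lo) < 0 < g(hi) for the value of n tried here
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid, n) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def delta_from_real_part(theta, n):
    a, b = theta * (n - 1) / 2, theta * (n + 1) / 2
    return ((math.cos(2 * math.pi / n) * math.cos(a) - math.cos(b))
            / (-math.sin(2 * math.pi / n) * math.sin(a)))


def delta_from_imaginary_part(theta, n):
    a, b = theta * (n - 1) / 2, theta * (n + 1) / 2
    return (math.sin(2 * math.pi / n) * math.cos(a)
            / (2 * math.sin(a) + math.cos(2 * math.pi / n) * math.sin(a) + math.sin(b)))


n = 100
theta = solve_theta(n)
theta0 = math.sqrt(2) * math.pi * n ** -1.5
print(theta / theta0)                                   # ratio to the leading-order value
print(delta_from_real_part(theta, n), delta_from_imaginary_part(theta, n), 1 / math.sqrt(2 * n))
```

At the root the two expressions for $\delta$ agree, which is what equation (1) encodes, and both are close to $1/\sqrt{2n}$ already at moderate $n$.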

Again $\Psi_{\max} = (1 + o(1))n$. Next we estimate $R$. If there is a shift-left, then provided
card $\diamondsuit$ is not in position $-(n-1)/2$, we have

$$\Delta\Psi_\diamondsuit = (\cos\theta - 1)[\cos(\theta x) + i\delta\sin(\theta x)]e^{2\pi i z/n} + \sin\theta\,[\sin(\theta x) - i\delta\cos(\theta x)]e^{2\pi i z/n}$$

$$= O(\theta^2) + O(\theta^2 x) + O(\theta\delta) = O(n^{-3}) + O(n^{-2}) + O(n^{-2}) = O(n^{-2}).$$

If card $\diamondsuit$ is in position $-(n-1)/2$, then

$$\Delta\Psi_\diamondsuit = 2i\delta\sin(\theta(n-1)/2)e^{2\pi i z/n} = O(n^{-1}).$$

Adding up these contributions over the different cards, we find $\Delta\Psi = O(n^{-1})$. Likewise
$\Delta\Psi = O(n^{-1})$ if the move was a shift-right. For transposes, $\Delta\Psi_\diamondsuit$ is nonzero for only
two cards, and for these it is $O(n^{-1})$. Thus in all cases we have $|\Delta\Psi|^2 \le O(n^{-2})$, and so
$R \le O(n^{-2})$. Plugging our values of $\lambda$, $\Psi_{\max}$, and $R$ into Lemma 2, we obtain, for fixed
values of $\varepsilon$, a lower bound on the mixing time of

$$\frac{1 - o(1)}{\pi^2} n^3 \log n \text{ shuffles.}$$
7. Remarks

We have seen how to extend the lower bound technique used in (Wilson, 2001) to shuffles
that are much slower than what the position of a single card would indicate. Interestingly,
for the shift-or-swap and symmetrized-Rudvalis shuffles, the spectral gap for the lifted shuffle
is smaller than the spectral gap of the shuffle itself, so it is curious that we obtained a lower
bound for these shuffles by considering their lifted versions.

In an early draft, we lower bounded the mixing time of the original Rudvalis shuffle without
considering its lifted version, and this earlier approach might be considered simpler. But it
is not clear how to lower bound the symmetrized Rudvalis shuffle without lifting it, and our
current approach has the advantage that the analyses for all three shuffles treated here are
similar. The original Rudvalis shuffle and its lifting are isomorphic, and the earlier analysis
is effectively a special case of the present analysis where the lifting is not explicit.

We suspect that the constants in the lower bounds of Theorem 1 are tight. Rudvalis asked
if his shuffle was the slowest shuffle evenly supported on two generators; the lower bounds
given here suggest that the shift-or-swap shuffle (for odd $n$) is slower by a factor of 4.